Sunday, April 06, 2008

Reproducible Research

There were a couple of recent posts on Greg Wilson's blog about reproducible research. The first is a pointer to a paper about WaveLab. The paper is from 1995, although the philosophy appears not to have changed much, judging from a more recent (2005) introduction on the website. The second post contains a reference to the Madagascar project.


These projects follow the work of Jon Claerbout on Reproducible Research.


Thoughts


  • This work is all about designing a workflow for computational research, and creating tools to support that workflow. Much of it seems quite banal, but that is partly the point: the tools free you from having to think about the boring and repetitious steps.
  • Levels of reproducibility

    1. Yourself, right away (when tweaking figures, for instance)

    2. Another research group member, or yourself 6 months later

    3. Someone in another group, (a) able to run the code and get the same results, and (b) able to understand the code

    4. Someone in another group, able to use your description to write their own code that gets the same results


    The last level is the gold standard of scientific reproducibility: experiments are reproduced from published descriptions (ideally, at least; the descriptions aren't always complete), or theorists rederive the intermediate steps leading to a paper's conclusions.

    In software development, levels 1-3 are what count. (It's not very useful software if others can't run your code and get the expected results.) The last level is more like multiple implementations of a standard (e.g., C++ compilers from multiple vendors).

  • The referenced works seem to focus mostly on processing raw data (images, seismic data, etc.) through a series of steps to produce a final result.


    How would this apply in other fields, say ones with smaller input data sets but much higher computational demands?


    The Madagascar project, for example, uses SCons to coordinate the steps in producing research output. What happens when one of those steps involves several weeks of supercomputer time? Two immediate problems spring to mind: one probably doesn't want that step launched without warning by the script, and the steps for launching such a job are likely to be very site-specific, making it difficult for others to reproduce elsewhere. (A rough sketch of one way to guard such a step appears after this list.)


  • Somewhat related entries of mine: Keeping notes and Solving Science Problems
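
To make the SCons point above concrete, here is a minimal sketch of an SConstruct that guards the expensive step behind an explicit command-line flag. The file names, the helper scripts (filter.py, run_simulation.sh, plot.py), the long_jobs flag, and the qsub invocation are all invented for illustration; they are not taken from Madagascar.

    # SConstruct -- sketch of coordinating processing steps with SCons.
    # All file names, scripts, and the qsub command line are hypothetical.
    import os

    env = Environment(ENV=os.environ)

    # Cheap preprocessing step: rerun automatically whenever the raw data changes.
    env.Command('filtered.dat', 'raw.dat',
                'python filter.py $SOURCE $TARGET')

    # Expensive step (weeks of supercomputer time). Only give SCons a rule for
    # it when explicitly asked (scons long_jobs=1); otherwise an existing
    # simulation.dat on disk is used as-is, and a missing one is reported as an
    # error instead of the job being launched without warning.
    if int(ARGUMENTS.get('long_jobs', 0)):
        env.Command('simulation.dat', 'filtered.dat',
                    'qsub -sync y run_simulation.sh $SOURCE $TARGET')

    # Final figure for the paper, rebuilt whenever the simulation output changes.
    env.Command('figure1.pdf', 'simulation.dat',
                'python plot.py $SOURCE $TARGET')

With something like this, scons figure1.pdf regenerates the cheap steps from whatever simulation output is already on disk, while scons long_jobs=1 is an explicit opt-in to the multi-week job; the site-specific launch details stay confined to run_simulation.sh.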